********************************************************************************
* Author: Jonathan Rogers                                                      *
* Date Last Updated: 26 June 2023 (just added more thorough comments)          *
* Stata Version 17.0                                                           *
*   - The table commands using statistic() fail in Stata 16 or earlier.        *
*   - Stata 16 versions of the same commands are commented out above them.     *
* Datasets: SienaMaster.dta, BalanceCheck.dta, WinningsCompare.dta             *
********************************************************************************

* The original raw data files were:

* mytitox21042016_table_classparta.csv
* mytitox21042016_table_classpartb1.csv
* mytitox21042016_table_classtoken.csv
* nameid.csv
* namegender.csv

* These cannot be shared, as they contain names and individually identifying
* information about subjects (the vast majority of whom are minors). The files
* presented here were created by merging the raw files together and then
* dropping variables that could be used to identify student subjects. Variables
* for each dataset are defined in the variable labels.

clear all 
capture log close
log using Analysis_Final.log, replace
set more off

use SienaMaster

describe
summarize


***************************
**#  Creating New Variables
***************************

* Creating variable for order in which a subject listed a particular link.
* 1 = first friend/acquaintance listed, 2 = second, 3 = third, ...

gen orderlist = .
replace orderlist = 1 if subid != subid[_n-1]
replace orderlist = orderlist[_n-1] +1 if subid == subid[_n-1]
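
* Sanity check (a sketch, not part of the original analysis): the lines above
* compare each row with the previous one, so they assume that each subject's
* rows sit in one contiguous block.  Under that assumption the number of rows
* with orderlist == 1 equals the number of distinct subid values, shown in the
* Unique column of the codebook output.

count if orderlist == 1
codebook subid, compact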

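* Creating variable for number of links listed by a subject.  numlink is first
* set on the last row of each subject, where orderlist equals the number of
* links.  Each pass of the loop then copies that value up one row, so 300 passes
* fill blocks of up to 301 rows.  30 is above the maximum number of names a
* subject could see on their screen.  I chose 300, because of an issue discussed
* below where a small number of subjects participated multiple times.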

gen numlink = .
replace numlink = orderlist if subid != subid[_n+1]
forvalues i = 1/300 {
	replace numlink = numlink[_n+1] if subid == subid[_n+1]
}
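
* Cross-check (a sketch, assuming each subject's rows are contiguous): since
* orderlist counts 1, 2, 3, ... within a subject, numlink should equal the
* largest orderlist for that subject.  numlink_chk is a hypothetical helper
* variable that is dropped right away.  A nonzero count would point to rows
* the loop did not reach or to a subject split across several blocks.

egen numlink_chk = max(orderlist), by(subid)
count if numlink != numlink_chk
drop numlink_chk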

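* Creating variable for subjects who could only list 3 links.  Unlike limited5
* and limited7 below, limited3 is 1 only for exactly 3 links and stays missing
* for subjects with fewer than 3.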

gen limited3 = .
replace limited3 = 0 if numlink > 3
replace limited3 = 1 if numlink == 3


* Creating variable for subjects who could only list 5 or fewer links

gen limited5 = .
replace limited5 = 0 if numlink > 5
replace limited5 = 1 if numlink <= 5

* Creating variable for subjects who could only list 7 or fewer links

gen limited7 = .
replace limited7 = 0 if numlink > 7
replace limited7 = 1 if numlink <= 7
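
* Each gen/replace pair for limited5 and limited7 amounts to a single logical
* expression, because Stata treats a missing numlink as larger than any number
* (so it gets 0 either way).  A minimal sketch confirming the equivalence:

assert limited5 == (numlink <= 5)
assert limited7 == (numlink <= 7)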


summarize

* Some subjects have an impossibly large number of links.  I looked at the 
* data and these subjects somehow participated multiple times.  Perhaps they
* reloaded the game during the session?

tab subid if numlink > 30

drop if subid == 9757 & orderlist > 27 // 52 links
drop if subid == 9982 & orderlist > 18 // 36 links
drop if subid == 10871 // 60 links
drop if subid == 10899 // 40 links
drop if subid == 13255 // 300 links
drop if subid == 14381 // 48 links

summarize

*************
**#  Analysis
*************

tab hours close if classpart == 1 // Table 1 Left Panel
tab hours close if classpart == 2 & limited5 == 1 // Table 1 Right Panel

tab close if classpart == 1 // Table 2 First Row
*tab close if classpart == 2
*tab close if classpart == 2 & limited7 == 1
tab close if classpart == 2 & limited5 == 1 // Table 2 Second Row
tab close if classpart == 2 & limited3 == 1

* Number of distinct subjects in each sample (Unique column of the output)

codebook subid if classpart == 1, compact
*codebook subid if classpart == 2, compact
*codebook subid if classpart == 2 & limited7 == 1, compact
codebook subid if classpart == 2 & limited5 == 1, compact
codebook subid if classpart == 2 & limited3 == 1, compact

tab hours if classpart == 1 // Table 3 First Column
tab hours if classpart == 2 & limited5 == 1 // Table 3 Second Column

* Use this if you have Stata 16 or earlier

*table close if classpart == 1, contents(mean hours semean hours) row
*table close if classpart == 2, contents(mean hours) row
*table close if classpart == 2 & limited7 == 1, contents(mean hours) row
*table close if classpart == 2 & limited5 == 1, contents(mean hours semean hours) row
*table close if classpart == 2 & limited3 == 1, contents(mean hours semean hours) row

* Use this if you have Stata 17 or later

table close if classpart == 1, statistic(mean hours) // Table 4 First Column
table close if classpart == 2, statistic(mean hours)
table close if classpart == 2 & limited7 == 1, statistic(mean hours)
table close if classpart == 2 & limited5 == 1, statistic(mean hours) // Table 4 Second Column
table close if classpart == 2 & limited3 == 1, statistic(mean hours)
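
* Optional sketch: a pattern that could replace the paired Stata 16 / Stata 17
* blocks above.  It branches on c(stata_version), and Stata skips the block
* whose condition is false, so the unsupported syntax is never executed.
* Shown for the first table only; it repeats the Table 4 first-column means
* rather than adding a new result.

if c(stata_version) >= 17 {
	table close if classpart == 1, statistic(mean hours)
}
else {
	table close if classpart == 1, contents(mean hours) row
}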


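* n = number of links a subject listed at a given hours value within a
* classpart.  It is stored on one row per subid-hours-classpart group, so
* summarize counts each group once.

bysort subid hours classpart : gen n = _N if _n == 1 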
su n if hours == 1 & classpart == 1
su n if hours == 2 & classpart == 1
su n if hours == 3 & classpart == 1
su n if hours == 4 & classpart == 1
su n if hours == 5 & classpart == 1

su n if hours == 1 & classpart == 2 & limited5 == 1
su n if hours == 2 & classpart == 2 & limited5 == 1
su n if hours == 3 & classpart == 2 & limited5 == 1
su n if hours == 4 & classpart == 2 & limited5 == 1
su n if hours == 5 & classpart == 2 & limited5 == 1

******************
**#  Balance check
******************

clear all
use BalanceCheck

describe
summarize

tab gender if classpart == 1
tab gender if classpart == 2

tab classnum if classpart == 1
tab classnum if classpart == 2


************************
**#  Winnings comparison
************************

clear all
use WinningsCompare

summarize unspent if classpart == 1
summarize unspent if classpart == 2

summarize coupons if classpart == 1
summarize coupons if classpart == 2


tab unspent if classpart == 1
tab unspent if classpart == 2

tab coupons if classpart == 1
tab coupons if classpart == 2

log close 
